google summer
Google Summer of Code 2022
These SMILES strings can be analysed using the RDKit library to obtain information about the atoms and bonds in the molecules. A molecular fingerprint is a vectorized representation of a molecule that captures details of its atomic configuration. During featurization, a molecule is decomposed into substructures (e.g., fragments), which are encoded into a fixed-length binary fingerprint: an array in which each element is either 1 or 0. For this project, I implemented atom-level and bond-level featurization, as well as molecule-level (global) featurization, in DeepChem, specific to the requirements of the D-MPNN model. The D-MPNN paper [1] suggests 133 features for each atom and 14 features for each bond in a molecule. The individual features are extracted from SMILES using the RDKit library and one-hot encoded to obtain a vectorized representation.
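The one-hot encoding step can be illustrated with a minimal sketch. It is not the DeepChem implementation. The helper names (`one_hot_with_unknown`, `featurize_atom`) and the tiny vocabularies are hypothetical, chosen only to keep the example short, and the atom properties are passed in by hand, whereas in practice they would be read from RDKit atom objects. The extra "unknown" slot at the end of each encoding is also an assumption of this sketch.

```python
from typing import Hashable, List, Sequence


def one_hot_with_unknown(value: Hashable, choices: Sequence[Hashable]) -> List[int]:
    """One-hot encode `value` over `choices`, with one extra trailing slot
    that is set when `value` is not among the known choices."""
    encoding = [0] * (len(choices) + 1)
    index = choices.index(value) if value in choices else len(choices)
    encoding[index] = 1
    return encoding


# Hypothetical, deliberately small vocabularies (NOT the D-MPNN feature sets).
ATOM_SYMBOLS = ["C", "N", "O"]
ATOM_DEGREES = [0, 1, 2, 3, 4]


def featurize_atom(symbol: str, degree: int, is_aromatic: bool) -> List[int]:
    """Concatenate the per-property encodings into one atom feature vector."""
    return (
        one_hot_with_unknown(symbol, ATOM_SYMBOLS)    # 3 known + 1 unknown = 4
        + one_hot_with_unknown(degree, ATOM_DEGREES)  # 5 known + 1 unknown = 6
        + [int(is_aromatic)]                          # single binary flag = 1
    )


# An oxygen atom with one neighbour, not aromatic (values supplied by hand).
carbonyl_oxygen = featurize_atom("O", 1, False)
print(carbonyl_oxygen)  # [0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0]
```

Concatenating more per-property encodings in the same way is how a fixed-length vector per atom or per bond is built up. The actual properties and vocabulary sizes used for D-MPNN are the ones defined in the paper [1].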
The Aim Of Scikit-Learn 1.0 - AI Summary
It may sound a bit strange, given that Scikit-Learn has been used by thousands of companies, data scientists, and researchers for a long time and is widely considered the most widely used framework for general-purpose machine learning. In this article, I do not want to analyse the new features, as many other articles do, but to understand the aim of Scikit-Learn with this release and its strategy for future development. Scikit-Learn was born in 2007, first as a Google Summer of Code project, and continued to be developed in a research environment. Its objective was to serve as a tool for data analysis without having to focus on any particular technology or code. For this reason, it is based on Python, an open-source, easy-to-use, general-purpose language that can embed C code. Another big problem when working with data is computational resources, in terms of memory and processing, so Scikit-Learn has always made a big effort to improve algorithm efficiency, allowing even users with limited computational resources to work with data. Scikit-Learn does this by using statistical approximations and low-level code (Cython).
scikit-learn/scikit-learn
The project was started in 2007 by David Cournapeau as a Google Summer of Code project, and since then many volunteers have contributed. It is currently maintained by a team of volunteers. Running the examples requires Matplotlib 1.1.1. CBLAS exists in many implementations; see Linear algebra libraries for known issues. The documentation includes more detailed installation instructions.